A Bayesian approach to voice activity detection using multiple statistical models and discriminative training

Authors

  • Tao Yu
  • John H. L. Hansen
Abstract

In this study, the problem of voice activity detection (VAD) is formulated in a Bayesian hypothesis testing framework. Unlike traditional VAD schemes that employ a single statistical model, multiple models are assumed to be potentially engaged, each with an a priori probability, because of the statistical diversity of the environmental noise degrading the speech. Moreover, the optimal a priori probabilities are explored using a method based on discriminative training, which is suggested to directly reduce the miss-hit rate and false-alarm rate of the VAD. As shown in the evaluations, VAD performance, both in terms of absolute performance and consistency across a diverse set of noise conditions, can be significantly improved using the proposed Bayesian method.
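
The abstract does not give the decision rule itself, so the following is only a minimal sketch of what a multiple-model Bayesian likelihood ratio test could look like, not the authors' formulation. It assumes a scalar frame feature, one Gaussian per model under each hypothesis, and strictly positive priors that would in practice come from discriminative training; all function and parameter names are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian (assumed model form, for illustration)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def multi_model_log_lr(x, priors, speech_models, noise_models):
    """Log likelihood ratio with the model index marginalized out.

    priors        : a priori probability of each model (all > 0, summing to 1)
    speech_models : (mean, variance) per model under H1 (speech present)
    noise_models  : (mean, variance) per model under H0 (speech absent)
    """
    log_h1 = log_sum_exp([
        math.log(p) + gaussian_log_pdf(x, mean, var)
        for p, (mean, var) in zip(priors, speech_models)
    ])
    log_h0 = log_sum_exp([
        math.log(p) + gaussian_log_pdf(x, mean, var)
        for p, (mean, var) in zip(priors, noise_models)
    ])
    return log_h1 - log_h0


def detect_voice(x, priors, speech_models, noise_models, log_threshold=0.0):
    """Declare speech when the marginalized log ratio exceeds the threshold."""
    return multi_model_log_lr(x, priors, speech_models, noise_models) > log_threshold
```

With a single model and a prior of 1.0 this reduces to an ordinary likelihood ratio test, which is the single-model case the abstract contrasts against. Raising `log_threshold` trades false alarms for missed detections.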

Similar resources

Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure

In this paper, we present an approach to incorporate discriminative weight training into a statistical model-based voice activity detection (VAD) method. In our approach, the VAD decision rule is derived from the optimally weighted likelihood ratios (LRs) using a minimum classification error (MCE) method. An adaptive online means of selecting two kinds of weights based on a power spectral flatn...

A statistical model-based voice activity detection employing minimum classification error technique

In this paper, we apply discriminative weight training to statistical model-based voice activity detection (VAD). In our approach, the VAD decision rule is expressed as the geometric mean of optimally weighted likelihood ratios (LRs) based on a minimum classification error (MCE) method. This approach differs from previous work in that different weights are assigned to each fre...

Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...

Voice activity detection based on statistical models and machine learning approaches

The voice activity detectors (VADs) based on statistical models have shown impressive performances especially when fairly precise statistical models are employed. Moreover, the accuracy of the VAD utilizing statistical models can be significantly improved when machine-learning techniques are adopted to provide prior knowledge for speech characteristics. In the first part of this paper, we intro...

A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures

A spectral conversion method using multiple Gaussian Mixture Models (GMMs) based on the Bayesian framework is proposed. A typical spectral conversion framework is based on a GMM. However, in this conventional method, a GMM-appropriate number of mixtures is dependent on the amount of training data, and thus the number of mixtures should be determined beforehand. In the proposed method, the varia...

Publication date: 2010